-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for hash scheduling #1260
Conversation
I'll try and have a closer look this weekend, but at the very least I'd like to see some accompanying unit tests. |
Hi @yvanoers, I had not seen that PR no, but (as mentioned in the original description) Jenkins was my motivation indeed. This is actually a PR I've had in my ToDo list since I started using dkron 2 years ago, as I used Was there any highlights/caution you wanted me to notice there? My understanding, now that I read it, is that everyone agreed with the approach. Definitely missing unit tests, I'll take care of those Note: I'm already using this in my fork, so far so good! |
Hi @fopina ! |
ah got it @yvanoers ! I think it would make sense to make it available everywhere though… cron library cares about schedule nothing else. There’s not enough entropy in the schedule string to use it as the hash meaning a new non-sense parameter would be added to the library method just so it would do the “strings.replace” anyone can easily do before calling it. So even though I think it would make sense, I don’t think it does in the end 😄 but happy to agree to disagree and closing this PR! |
There are no high expectations of this getting included in the cron scheduler. Will finish code review hereafter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking that doing this implementation correctly while preventing any undesired/unexpected behavior may need to be done by parsing the schedule in greater detail.
Which is why the cron scheduler itself is arguably the best place for this functionality. Otherwise it may be necessary to parse the schedule twice, which feels wasteful to me.
I'm happy to be proven wrong though :)
Also: We're not against including a modified version of the entire cron scheduler in Dkron. It's been so before. Ideally we wouldn't have to, though, but that is probably not feasible in the short term.
dkron/job.go
Outdated
// such as "0 0 H * * *" | ||
func (j *Job) scheduleHash() string { | ||
spec := j.Schedule | ||
if strings.Contains(spec, "H") && strings.Count(strings.TrimSpace(spec), " ") == 5 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cron format allows for a timezone setter as a prefix, e.g. 'TZ=Europe/Madrid 0 0 H * * *'.
This case isn't accounted for.
Keep in mind that there are timezones that have an 'H' in them (especially relevant in line 334).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored a bit to handle TZ
CRON_TZ
and @...
, thanks for the heads up!
dkron/job.go
Outdated
func (j *Job) scheduleHash() string { | ||
spec := j.Schedule | ||
if strings.Contains(spec, "H") && strings.Count(strings.TrimSpace(spec), " ") == 5 { | ||
h := 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you write out the actual variable name, hash
I'm guessing it is supposed to be represent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, hope those look clearer now
dkron/job.go
Outdated
} | ||
parts := strings.Split(spec, " ") | ||
for index, part := range parts { | ||
if strings.Contains(part, "H") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has the side effect of allowing an expression such as */H
be accepted and processed.
Is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is intended yes, not because of */H
but for H/5
(every 5min starting at H
).
Not sure if */H
would be of any use, but I don't think it's worth actively excluding it as */55
(every 55minutes) is still valid cron expression 🤷
What do you think?
dkron/job.go
Outdated
ph := h | ||
switch index { | ||
case 2: | ||
ph %= 24 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all days are 24 hours long. Not sure how the Jenkins scheduler handled that, but it is worth thinking about and documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understood? I'm probably missing when are days not 24h long.
Also, that doesn't seem to be an "hash" issue since most people setting up jobs to run at specific hour every day, will choose an hour out of 24...
The reference for these mods are the official dkron documentation for cron expressions, hour part is 0-23
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the day not being 24h long is a bit besides the point - sorry.
My point was daylight savings time:
When daylight savings time kicks in, for timezones that have that, 2 AM is skipped over: 1:59AM -> 3:00AM. If something is scheduled to jitter around 3AM, it may decide to pick 2AM on a day that does not have 2AM.
The other way around, when coming back from daylight savings time, you have 2AM twice: 2:59AM -> 2:00AM.
This is why it is generally a good idea to run scheduling servers on UTC time. Running a scheduler with a timezone that can jump back and forth in time is risky. Hence, worth thinking about and documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nice point on DST! Good stuff to keep in mind for many things!
Personally I have every server in UTC; not just cronjobs, because I never know which apps write logs with UTC or localtime and then depending on the visualization tool, again the same issue, so I'm totally onboard with the UTC recommendation.
That being said, I don't think I agree that it should be worth thinking and documenting "within this PR" but rather at the scheduler level (maybe not even dkron, but the cron lib).
This PR only adds a special character that will resolve to number. And it will resolve for valid values of the field where it's used.
If there are values that might result in bad behavior, it's still the same as if the user would choose that specific value himself (instead of using H
/~
).
I agree that we should avoid cases that make sense (like you mentioned for months, hope you agree with using 28
as limit), but for dealing with DST and the issues you mention, "once a day" schedule will be subject to them, regardless of using 0 0 H * * *
or 0 0 3 * * *
.
So I think it's a valid caution/recommendation to note somewhere, but not in the scheduleHash
specifically, as the risk is not introduced by using it.
What do you think?
dkron/job.go
Outdated
case 2: | ||
ph %= 24 | ||
case 3: | ||
ph = (ph % 31) + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all months are 31 days long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While for hours I didn't understand your point, I do in this one 😄
Updated max to 28
🙇
Curious cron library would handle choosing day 31 for a month of 30, as anyone can specify that in a spec, will it skip a month?
Thank for taking the time for explaining @yvanoers , I do understand your point much better now. Still I'm not sure how to introduce the hashable parameter without making it look nonsense. But I'll definitely take a look first then! |
I was looking into updating my fork 3.2.3 to latest 3.2.7 and couldn't remember why I had a fork, only to go through history and find this feature was the reason 🤦 I didn't look into opening PRs to cron scheduler but re-reading this thread I believe it's agreed it would be unlikely to happen anyway. Should I try to re-implement this with something other than Or simply going with a regexp |
That may be worth exploring. |
To support hash scheduling strategies, should we consider implementing a new nodeSelector? Then introduce a new configuration item to switch the scheduling strategy? The default nodeSelector strategy is random dispatch. Line 33 in f114610
Lines 757 to 760 in f114610
|
dd2fb04
to
af8a649
Compare
2f6704b
to
d3b24ea
Compare
@yvanoers thank you for all the feedback. I've also updated Hope it looks better now! Sadly, I couldn't reply to review comments within a single review (sorry for the spam), because of the force push I had to do when switching the PR to v4, but it felt like a required step as well ⚡ Also added a small unit test |
@yvanoers any chance for review? :) |
is it worth the effort to rebase this PR with v4? |
Isn't it already targeted at v4 ? |
No, I don’t think there was a v4 when this branch was created |
Well, you changed the base from main to 4.x 7 months ago and did a couple force pushes. So if you didn't rebase to the 4.x branch back then already, I would think is does make sense to rebase it. |
Hey this somehow keeps escaping my radar sorry, I'll take a look and get back to you, it looks like it'll merge right away. @fopina are you already using v4? |
@yvanoers you're right, i probably rebased and can’t remember @vcastellm looks like it indeed, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but can you add docs in https://github.com/distribworks/dkron/blob/4.x/website/docs/usage/cron-spec.md ? thx
right, sorry for missing that @vcastellm, done now 👍 |
website/docs/usage/cron-spec.md
Outdated
@@ -51,6 +51,10 @@ Question mark ( ? ) | |||
Question mark may be used instead of '*' for leaving either day-of-month or | |||
day-of-week blank. | |||
|
|||
Tilde ( ~ ) | |||
|
|||
Tilde will be replaced by a numeric value valid for the range where it is used. It allows periodically scheduled tasks to produce even load on the system. For example, scheduling multiple hourly jobs to "0 H * * * *" rather than "0 0 * * * *" will run the jobs at different minutes of every hour. It can be thought of as a random value over a range, but it actually is a hash of the job name, not a random function, so that the value remains stable for any given job. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be "0 ~ * * * *"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 inspired myself in jenkins docs and forgot to replace!
add support for hash scheduling (distribworks#1260)
* Convert shell plugin to internal plugin (#975) The purpose of this PR is to embed the shell plugin in the main dkron binary, that will facilitate creating a single binary with the most important executor, for easy deployment using a single binary. * Remove old UI (#984) * Release a light image tag (#988) Omit all plugins except the shell plugin that will be included in the main binary. * Upgrade react admin to v4 (#1436) * Use exponential backoff for retries (#1433) * Handle ip changes (#1446) * consul like approach: add server_lookup to make dkron independent from server node IP on raft layer * handle memberupdate event * query other servers before start & add test * don't remove itself if node is a raft leader * don't remove dkron server node if id matches * Move config init to agent command (#1465) Config init was being done on the root level command but only the agent command was using config values. Now config init is done as pre-run of agent command only, getting rid of extra messages in other commands when the config was missing. * Fix disabled (#1467) * Embed http plugin (#1471) http plugin is one of the most used plugins together with the shell plugin, embedding it in the main binary allow for more lean deployment. * Show all Job fields * Refactor buttons for admin v4 * Docs v4 (#1473) * Reuse http clients with the same configuration (#1474) * Update OpenAPI spec to v3.1.0 and add ACLs spec * Refactor location of types as they not only belong to plugins, but used in general. (#1485) Also add Pro protobuf and gen code as this should be public. * Generate client * Extend protobuf defs for ACLs * Token fields * Minor improvement in RabbitMQ executor (#1500) * nice and forkable goreleaser (#1584) * add support for hash scheduling (#1260) * feat: Login form and admin update (#1589) Upgrade react-admin to v5 Implements a login form to use in Pro version --------- Co-authored-by: Victor Castell <[email protected]> Co-authored-by: Ivan Kripakov <[email protected]> Co-authored-by: Filipe Pina <[email protected]>
This PR adds support for Jenkins-style hash schedule spec such as
H
is replaced by something derived from job name and moduled to its position within the spec (such asH mod 60
for minutes butH mod 24
for hour).This allows for easily scheduling many daily jobs with same cron spec while still mildly balancing them throughout the day.
I tried to do it directly in the dkron
Scheduler
but it gets tricky due to the need of a second parameter that is not part of cron.Scheduler interface.Also, the
hash
itself is actually just summing up the character values for simplification. It could be improved with an actual hashing method with less collisions and better distribution but in the end it would still use modullus 7, 12, 24, 60 causing many collisions.As this is not for security but only mildly load balancing, I left as is.
Let me have feedback!